Optimize Your Dev Setup For Evals w/ Cursor Rules & MCP
This guide helps you optimize your dev setup to get the most out of AI. It’s illustrated with specific examples for eval frameworks, but the concepts apply to any tool. The goal is to help you understand the different layers of context and how to use them to your advantage with your AI tools.
This content was originally created and taught as part of Hamel and Shreya’s AI Evals Course on Maven (30% off with that link!). If you are building AI tools or products, it’s a must-take course. Watch the original video here.
1 Why bother?
- Better AI assistance
- Training cutoff may mean outdated info
- Spending time determining what’s important to your project is good
  - Forces you to understand your tools better
- Make AI better match your taste
- Show how AI integrates with your other tools, abstractions, or framework
2 The Three Layers
- General Context: Generic tool that works on almost anything
  - Repomix, GitMCP, etc.
- Curated Context: Curated by an expert, such as the tool author
  - Library provided MCP, llms.txt, etc.
- Personalized Context: Context that you can create that’s unique to your project
  - Only you can make this and it’s uniquely tailored to your taste and needs
3 General Context
General context is a good starting point when exploring or unsure about a tool. It’s generic and not optimized for specific needs, but it’s fast and easy to set up. That makes it useful for quick experimentation and exploration, though not the best choice for long-term use. Here are a few examples of general context tools.
- Good starting point when exploring or unsure about a tool
- Generic and not optimized for specific needs
- Fast and easy to set up
- Useful for quick experimentation and exploration
3.0.1 Repo Mix
RepoMix is a tool that takes a GitHub repo and concatenates all the files matching a pattern into a single file. This is useful for getting lots of context into a single file that your model can easily understand.
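To make the concatenate-by-pattern idea concrete, here is a minimal Python sketch. It is not RepoMix's implementation or API; the function name `concat_repo`, the default patterns, and the `=====` separator format are all assumptions chosen for illustration.

```python
from fnmatch import fnmatch
from pathlib import Path


def concat_repo(root, patterns=("*.py", "*.md"), out_file=None):
    """Concatenate files under `root` whose names match any pattern.

    Hypothetical helper for illustration only; not part of RepoMix.
    """
    root = Path(root)
    parts = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if rel.startswith(".git/"):  # skip git internals
            continue
        if not any(fnmatch(path.name, pat) for pat in patterns):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        # Label each file so the model can tell where it came from.
        parts.append(f"===== {rel} =====\n{text}\n")
    combined = "\n".join(parts)
    if out_file is not None:
        Path(out_file).write_text(combined, encoding="utf-8")
    return combined
```

Calling `concat_repo("path/to/repo", out_file="context.txt")` would write a single file you can paste or attach as context. Labeling each chunk with its relative path is a design choice that helps the model refer back to specific files.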
3.0.2 Paste Max
PasteMax is a native app you install on your machine that lets you concatenate content and files directly in a nice UI. It serves a similar purpose to RepoMix, but as a desktop app it has a different UX that many prefer.
3.0.3 Git MCP
GitMCP lets you use a GitHub repo as a context source by creating an MCP server, so agents can interact with the repo via tools. It’s a different approach to general context than concatenating files into one document. It also has a nice web chat interface that’s great for quick questions to explore the repo.
- Repository chat interface: https://gitmcp.io/UKGovernmentBEIS/inspect_evals/chat
4 Cursor Rules
Cursor rules let you customize how Cursor’s AI assistant behaves in your project. They are extremely helpful and should not be overlooked.
4.1 Rule Creation
You can create rules by hand in the .cursor/rules directory, but there’s a Cursor command that does it for you.
4.2 Choosing a Rule Type
Cursor lets you choose a rule type to determine when the models will use the rule context. This is a great way to give a bit more control over model context.
- Always Apply - Core rules that should never be violated
- Apply Manually - Rules you trigger when needed. If you’re unlikely to think carefully about context on each request, don’t use this, because the rule only helps when you remember to invoke it.
- Apply Intelligently - Cursor decides when the rule is relevant based on the rule’s description. This is a great default for most rules.
- Apply to Specific Files - File-type or path-specific rules. This can be nice, but you’re liable to set it to a path, restructure something, then forget, and the rule will silently stop applying.
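As a rough sketch of how the rule types above map onto a rule file, the Python below renders the text of a rule with frontmatter. The frontmatter keys (`description`, `globs`, `alwaysApply`) reflect my understanding of Cursor's `.mdc` rule format and may differ between Cursor versions; the helper `render_rule` and the example rule content are hypothetical.

```python
def render_rule(body, description="", globs=(), always_apply=False):
    """Render rule text with frontmatter.

    Assumed mapping (verify against your Cursor version):
      - Always Apply            -> always_apply=True
      - Apply Intelligently     -> description set, no globs
      - Apply to Specific Files -> globs set
      - Apply Manually          -> nothing set
    """
    lines = [
        "---",
        f"description: {description}",
        f"globs: {','.join(globs)}",
        f"alwaysApply: {'true' if always_apply else 'false'}",
        "---",
        "",
        body.strip(),
        "",
    ]
    return "\n".join(lines)


# Hypothetical "Apply to Specific Files" rule for eval code.
eval_rule = render_rule(
    body="- Keep each eval task in its own file.\n- Prefer small, named scorers.",
    description="Conventions for eval tasks",
    globs=("evals/**/*.py",),
)
```

Writing `eval_rule` to a file such as `.cursor/rules/evals.mdc` (a hypothetical filename) would give you the same kind of file the Cursor command creates.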
5 Curated Context
Curated context is a good next step when you want to explore more deeply. It’s provided by the tool author in some format, and it’s a good way to get a lot of relevant context into a form that your model can easily understand. This will usually be better than general context, while still being only a small investment to set up. However, not every tool or library provides curated context.
- You want to explore more deeply
- It’s provided by the tool author in some format
5.0.0.1 MCP (Model Context Protocol)
Braintrust provides an MCP server that your AI agents can use. This is a lot like GitMCP, which we saw above, but it was created by the Braintrust team specifically for Braintrust, so it’s much better.
- Official MCP server: https://www.braintrust.dev/docs/reference/mcp
- Provides structured access to Braintrust functionality
To set up the MCP server with Cursor, you can use the following settings.
5.0.0.2 LLMs.txt
Phoenix provides a file in the llms.txt spec format. This format can be used as static context in a flat file, or as an index that an MCP server tool can crawl agentically.
Standard Format for Phoenix: https://arize.com/docs/phoenix/llms.txt
LangChain MCP documentation tool: https://github.com/langchain-ai/mcpdoc
5.0.0.3 llms.txt
For Phoenix, I provided a link to LangChain’s MCP server that lets you use the llms.txt file as a tool. For Inspect, let’s use the flat files from the llms.txt spec directly as context to see the difference.
- Standard format: https://inspect.aisi.org.uk/llms.txt
- Full version: https://inspect.aisi.org.uk/llms-full.txt
- Individual Pages as Context: https://inspect.aisi.org.uk/index.html.md
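To show what a tool or script can do with an llms.txt file, here is a minimal Python sketch that parses the link lists under each `##` section. It assumes the common layout of the llms.txt spec (an H1 title, an optional blockquote summary, then H2 sections containing `- [title](url): note` items); the sample text, its URLs, and the function name `parse_llms_txt` are made up for illustration.

```python
import re

# Matches "- [Title](https://url): optional note"
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$"
)

# Made-up sample in the assumed llms.txt layout.
SAMPLE = """# Example Project

> Short summary of the project.

## Docs

- [Quickstart](https://example.com/quickstart.md): Install and first run
- [API](https://example.com/api.md)

## Optional

- [Changelog](https://example.com/changelog.md): Release notes
"""


def parse_llms_txt(text):
    """Return {section name: [(title, url, note), ...]} for each H2 section."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(
                    (match["title"], match["url"], match["note"] or "")
                )
    return sections


links = parse_llms_txt(SAMPLE)
```

With the links parsed, you can pick only the pages relevant to your task instead of pasting the full file, which is roughly the choice an MCP documentation tool makes on your behalf.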
Manual curation of context is also a great way to have the most control. There is typically a tradeoff between the amount of time you spend curating the context and the quality of the context. You will get better results if you spend more time carefully curating the context, and whether that payoff is worth the time is dependent on a few things:
- How long will you be using this tool and how much? If you use it daily and will for years, it’s worth the time to curate the context.
- How good is the library provided context? If the authors have carefully curated the context for LLMs specifically and use it themselves, then it often works pretty great out of the box. Sadly, most tools don’t do this (yet).
5.0.0.4 Jina AI
Jina AI is an easy way to turn any HTML page into a markdown file. Markdown is much better context than HTML, because there’s loads of unnecessary noise in the HTML that doesn’t add value (styling classes, JS interactivity, server calls, social media icons, etc.).
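A minimal Python sketch of this workflow is below. It assumes Jina's Reader endpoint, which returns a markdown rendering of a page when you prefix the page URL with `https://r.jina.ai/`; the function names and the User-Agent string are hypothetical, and rate limits or API keys may apply depending on your usage.

```python
from urllib.request import Request, urlopen

# Assumed Reader prefix; check Jina's docs for current behavior.
JINA_READER_PREFIX = "https://r.jina.ai/"


def reader_url(page_url):
    """Build the Reader URL for a page by prefixing the original URL."""
    return JINA_READER_PREFIX + page_url


def fetch_markdown(page_url, timeout=30):
    """Fetch the markdown rendering of a page (makes a network call)."""
    request = Request(
        reader_url(page_url),
        headers={"User-Agent": "context-fetch-sketch"},
    )
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

For example, `fetch_markdown("https://inspect.aisi.org.uk/")` would request a markdown version of the Inspect home page that you can save and attach as context.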
5.0.0.5 Copy Outer HTML
By going more manual you can remove the noise and get a more focused context, at the cost of more work. The main benefit over Jina AI is that you can often remove lots of unneeded sections like footers, tables of contents, sidebars, navigation, etc.
Copying the outer HTML and feeding that into Web2Md is the best workflow I’ve found for this.
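If you want to script part of this cleanup, the sketch below uses Python's standard-library `html.parser` to drop the contents of noisy elements and keep the remaining text. It is not how Web2Md works; the list of tags to skip and the names `MainTextExtractor` and `strip_noise` are assumptions, and it returns plain text rather than markdown.

```python
from html.parser import HTMLParser

# Elements whose contents are treated as noise (an assumed list; adjust per site).
SKIP_TAGS = {"nav", "footer", "aside", "script", "style"}


class MainTextExtractor(HTMLParser):
    """Collect text that is not nested inside any SKIP_TAGS element."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # how many skipped elements we are inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def strip_noise(html):
    """Return the visible text of `html` with noisy sections removed."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Paste the copied outer HTML into `strip_noise(...)` to get a quick look at what survives before spending time on a more careful conversion.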
5.0.0.6 Web 2 MD
6 Personalized Context
Personalized context is the most time consuming and the most effective. It’s not something anyone can create for you, because it matches your taste and the decisions made for your specific project.
- You know you’re going to be using this thing for months
- It’s worthwhile to invest a bit of time for better AI assistance
Let’s look at some examples of diffs you may want to apply to your context to make it more personalized.